[Enhancement] Use weighted ranking to cap refinement candidates (CF-931) #962
PR Reviewer Guide 🔍 Here are some key observations to aid the review process:
PR Code Suggestions ✨ Explore these optional code suggestions:
⚡️ Codeflash found optimizations for this PR 📄 115% (1.15x) speedup for
KRRT7
left a comment
I really like this, just implement the changes that the PR review bot gave you
…efined-candidates
I suspect this will help with the long runtimes of our tracer-replay as well, now that I think of it.
# Refinement
REFINE_ALL_THRESHOLD = 2  # when valid optimizations count is 2 or less, refine all optimizations
REFINED_CANDIDATE_RANKING_WEIGHTS = (2, 1)  # (runtime, diff), runtime is more important than diff by a factor of 2
TOP_N_REFINEMENTS = 0.45  # top 45% of valid optimizations (based on the weighted score) are refined
any reason for this number?
Nothing in particular. I was thinking of making it a fixed number instead, maybe 3?
@misrasaurabh1 @KRRT7 @aseembits93
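To make the trade-off between a fractional cap and a fixed cap concrete, here is a rough sketch of how many candidates each policy would send to refinement. The helper names `cap_by_fraction` and `cap_by_fixed` are hypothetical, and rounding the fraction up is an assumption; the PR's actual rounding is not shown in this thread.

```python
import math

REFINE_ALL_THRESHOLD = 2   # from the diff: refine everything at or below this count
TOP_N_REFINEMENTS = 0.45   # from the diff: fraction of valid candidates to refine


def cap_by_fraction(n_valid: int, fraction: float = TOP_N_REFINEMENTS) -> int:
    """Hypothetical helper: number of candidates refined under a fractional cap."""
    if n_valid <= REFINE_ALL_THRESHOLD:
        return n_valid
    # Assumption: round up so that at least one candidate is always refined.
    return max(1, math.ceil(n_valid * fraction))


def cap_by_fixed(n_valid: int, fixed: int = 3) -> int:
    """Hypothetical helper: number of candidates refined under a fixed cap."""
    if n_valid <= REFINE_ALL_THRESHOLD:
        return n_valid
    return min(n_valid, fixed)


for n in (2, 3, 5, 10):
    print(n, cap_by_fraction(n), cap_by_fixed(n))
```

Under these assumptions the fractional cap grows with the number of valid candidates, while the fixed cap bounds the refinement cost regardless of how many candidates there are.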
return [v / total for v in importance.values()]

def normalize(values: list[float]) -> list[float]:
can you rename this function to min_max_normalize? normalize is too broad
weights = choose_weights(runtime=runtime_w, diff=diff_w)

runtime_norm = normalize(runtimes_list)
diffs_norm = normalize(diff_lens_list)
I am wondering whether min-max normalization is a good idea for these.
With it, every candidate with the minimal runtime or diff_len gets a normalized value of 0, and every candidate with the maximal one gets a value of 1. That holds even if the difference between the min and the max is minuscule.
The problem I see is that min-max normalization discards the relative scale of the runtimes or the diff lengths.
Instead of normalizing with min = the smallest data point, why not try min = 0? Diff length and runtime can never be smaller than 0, so with this formulation we can think of the values as a vector emanating from the origin: the largest data point gets a value of 1, and every other point gets a value proportional to its magnitude between 0 and the max. So if a runtime is half of the max, a score of 0.5 sounds more reasonable than 0.
This preserves a sense of scale.
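A small comparison of the two schemes illustrates the point. This is only a sketch: `max_normalize` is a hypothetical name for the zero-based variant, and the body of `min_max_normalize` is an assumption about what the PR's `normalize` does, since only its signature appears in the diff.

```python
def min_max_normalize(values: list[float]) -> list[float]:
    """Map the smallest value to 0 and the largest to 1 (assumed behaviour)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to normalize over.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def max_normalize(values: list[float]) -> list[float]:
    """Hypothetical zero-based variant: divide by the max, keeping relative scale."""
    hi = max(values)
    if hi == 0:
        return [0.0] * len(values)
    return [v / hi for v in values]


# Three runtimes that differ by only about 2%.
runtimes = [1000, 1010, 1020]
print(min_max_normalize(runtimes))  # [0.0, 0.5, 1.0]: tiny gaps stretched to the full range
print(max_normalize(runtimes))      # all values stay close to 1.0
```

With min-max scaling, the 2% spread is blown up to the full 0 to 1 range, whereas dividing by the max keeps the three candidates nearly tied on runtime, so the diff length can decide the ranking.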
nice, this is definitely more accurate
@mohammedahmed18 let's merge this. You should keep an eye out to see if the filtering still makes sense with the new ranking logic.
PR Type
Enhancement
Description
Rank refinements by runtime and diff
Add normalization and weighting utilities
Change refinement request payload to ints
Cap refinements to top 45% candidates
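The ranking described above can be sketched roughly as follows. The `choose_weights` return line and the `normalize` signature come from the diff; everything else is an assumption for illustration, including the `rank_candidates` helper (a hypothetical name), the min-max body of `normalize`, and the convention that a lower score is better.

```python
def choose_weights(**importance: float) -> list[float]:
    """Turn raw importance values into weights that sum to 1."""
    total = sum(importance.values())
    return [v / total for v in importance.values()]


def normalize(values: list[float]) -> list[float]:
    """Assumed min-max scaling to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rank_candidates(
    runtimes: list[float],
    diff_lens: list[float],
    ranking_weights: tuple[float, float] = (2, 1),  # (runtime, diff), as in the config
) -> list[int]:
    """Hypothetical helper: candidate indices ordered best first (lowest score)."""
    runtime_w, diff_w = ranking_weights
    w_runtime, w_diff = choose_weights(runtime=runtime_w, diff=diff_w)
    runtime_norm = normalize(runtimes)
    diffs_norm = normalize(diff_lens)
    # Assumption: shorter runtime and smaller diff are both better, so lower wins.
    scores = [w_runtime * r + w_diff * d for r, d in zip(runtime_norm, diffs_norm)]
    return sorted(range(len(scores)), key=scores.__getitem__)


# Candidate 1 is fastest but has the largest diff; runtime counts twice as much.
print(rank_candidates(runtimes=[100, 50, 80], diff_lens=[10, 40, 20]))  # [1, 2, 0]
```

Because runtime carries twice the weight of the diff length, the fastest candidate still ranks first here despite having the largest diff.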
Diagram Walkthrough
File Walkthrough
aiservice.py: Humanize runtime fields; trim logs (codeflash/api/aiservice.py)
code_utils.py: Utilities for weighting and normalization (codeflash/code_utils/code_utils.py)
models.py: Refiner request runtime type to int (codeflash/models/models.py)
function_optimizer.py: Weighted, selective, parallel refinement flow (codeflash/optimization/function_optimizer.py)
config_consts.py: Config for weighted refinement capping (codeflash/code_utils/config_consts.py)